DETAILED ACTION

This office action is prepared in response to a Request for Continued Examination (RCE) 
filed on December 21, 2010. 

Claims 17, 19-26, 28-34, 36-41 and 43-52 were previously pending in the final action 
mailed on September 24, 2010. 

In the amendments filed on December 21, 2010, 
No claim is cancelled; 
Claims 17, 21, 32, 47 and 51 are amended; 
Claims 17, 19-26, 28-34, 36-41 and 43-52 are now pending. 

Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set forth in
37 CFR 1.17(e), was filed in this application after final rejection. Since this application is 
eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) 
has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 
37 CFR 1.114. Applicant's submission filed on December 21, 2010 has been entered. 

Response to Amendments 

2. Applicant's arguments and amendments filed on December 21, 2010 have been carefully 
considered. However, the amendments and arguments failed to place the application in 
condition for allowance for the following reasons. 
Applicant amended claims 17, 32, 47 and 50 to now recite "signal pause duration data
specifying a length of a pause between audio events ", with the underlined part being the 
amendments. 

Examiner considers the amendment to make no substantial change to the scope of the 
claim because to one of ordinary skill in the art, "signal pause duration data" is data that specifies 
a length of a pause between audio events. The amendments merely recite the meaning of "signal 
pause duration data" without further limiting it. 

Furthermore, although Russell might not have explicitly disclosed that the signal pause
duration data specifies a length of a pause between audio events, it would have been within
the knowledge of one of ordinary skill in the art that said length of a pause between audio events
could have been easily derived from the information in the audio information data block. The
difference between Russell and the instant claim element, if any, is a matter of implementation choice.

Therefore the ground of rejection previously presented by the Examiner still applies. 

Further regarding the claim amendments, Applicant amended claims 21 and 51 to recite 
"filter out a succession of information data blocks between two adjacent signal pause data 
blocks" where the word "adjacent" is added. 

Examiner considers this amendment to make no substantial change to the scope of the 
claim because in the nature of the relevant art of speech data segmentation, the speech data is 
segmented into information data blocks and signal pause data blocks in alternating order, with 
every information data block situated between two adjacent signal pause data blocks and vice 
versa. Therefore, what is recited in the claim regarding "information data blocks between two
adjacent signal pause data blocks" is implicit in the art of speech data segmentation. 

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

3. Claims 17, 20, 26, 32-33, 41, 47 and 50 are rejected under 35 U.S.C. 103(a) as being
obvious over Russell et al. (U.S. Patent No. 5,526,407, hereinafter "Russell"), in view of
Yamamoto et al. (U.S. Patent No. 4,355,338, hereinafter "Yamamoto").

Regarding claim 17, Russell disclosed a method for digitally recording an analog audio 
signal, the method comprising: 

(a) receiving an analog audio signal (Russell, Fig. 2 and col. 9, line 30 disclosed "analog 
front end", which records analog signal) containing audio information and signal pauses 
information (a speech signal by nature contains speech information (i.e., audio) and non-speech 
information (i.e. signal pauses), as evident in Russell, col. 7, lines 4-5); 

(b) converting the analog audio signal into digital audio signal comprising audio 
information data and signal pause duration data (Russell, Fig. 2 and col. 9, lines 31-33);

(c) storing the audio information data of the digital audio signal as information data 
blocks and the signal pause duration data of the digital audio signal as signal pause data blocks 
having different time durations in a memory (Russell, col. 6, lines 43-53 disclosed storing the
speech stream and structure which represents categorized portions of the speech stream; Russell, 
col. 15 lines 8-11 further disclosed that the characterization of speech includes the phrase time 
duration and the presence of pauses), 

wherein each information data block contains an information data block identifier and 
audio information data, and signal pause data block contains a signal pause data block identifier 
and signal pause duration data specifying a length of a pause between audio events (Russell, col. 
18, lines 2-13, "a data element" and "Phrase ID" or col. 18, lines 64-66, "phrase descriptor 
structure" and "phrase ID"; Russell might not have explicitly disclosed that the signal pause
duration data specifies a length of a pause between audio events, but it would have been within
the knowledge of one of ordinary skill in the art that said length of a pause between audio events
could have been easily derived from the information in the audio information data block. The
difference between Russell and the instant claim element, if any, is a matter of implementation choice);

and the audio information data and the signal pause duration data of the resulted digital 
audio signal represent outputs at a normal speaking speed (Russell, col. 7, lines 4-7 and col. 7, 
lines 13-15. In particular, col. 7, lines 13-15 disclosed "When playing the recalled speech, the 
present invention may optionally skip the identified speech pauses and non-speech utterances," 
which implies that the pauses and non-speech utterances represent output at a normal speaking
speed); 

(d) generating a plurality of audio information data sequences by sequentially reading the 
information data blocks and the signal pause data blocks (Russell, col. 14, lines 43-50 disclosed 
that a speech process program 136 separates the speech into phrases demarcated by perceptible
pauses; col. 17, line 67 disclosed "RIFF chunk" as an audio information data sequence), the
audio information data sequences being separated by the signal pause data blocks if an assigned 
time duration of the signal pause data block is higher than a predetermined time duration (col. 
17, line 64 disclosed that the silence of a specified second duration marks a signal pause). 

(e) producing an index table by sequentially reading the information data blocks and the 
signal pause data blocks (Russell, Fig. 5 and col. 14, lines 43-67 and col. 11, lines 1-17, "tag 
tables" or Fig. 18 and col. 18, lines 53-61, "phrase descriptors"). 
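
For illustration only, the block layout and index-table construction recited in elements (c) through (e) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Russell or from the application: the names `Block`, `build_index` and `min_pause_s`, the single-letter block identifiers, and the use of block positions as index entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical block identifiers; the claim only requires that the two
# kinds of block be distinguishable.
INFO, PAUSE = "I", "P"


@dataclass
class Block:
    kind: str          # INFO or PAUSE (the block identifier)
    duration_s: float  # playing time of the audio data, or length of the pause


def build_index(blocks: List[Block], min_pause_s: float) -> List[Tuple[int, int]]:
    """Return (start, end) block positions of each audio information data sequence.

    A sequence is a run of blocks that is not interrupted by a pause block
    longer than min_pause_s; shorter pauses stay inside the sequence.
    """
    index: List[Tuple[int, int]] = []
    start: Optional[int] = None      # first INFO block of the current sequence
    last_info: Optional[int] = None  # most recent INFO block seen
    for pos, blk in enumerate(blocks):
        if blk.kind == INFO:
            if start is None:
                start = pos
            last_info = pos
        elif blk.duration_s > min_pause_s and start is not None:
            # A sufficiently long pause closes the current sequence.
            index.append((start, last_info))
            start = None
    if start is not None:
        index.append((start, last_info))
    return index
```

Reading the blocks once, in order, is enough to produce both the sequences and the index entries, which is why a single sequential pass is used in this sketch.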

Russell did not explicitly disclose receiving an analog audio signal played at an increased
speed.

However, Yamamoto disclosed a method for fast reproduction of music or language
tapes, where it is known in the prior art that, to rapidly mass-reproduce music or language tapes,
the source tape can be driven at a high speed (such as 32 times the normal speed) (see Yamamoto,
col. 1, lines 19-22 and lines 38-45).

One of ordinary skill in the art would have been motivated to combine Russell and
Yamamoto because both disclosed a system and method for converting an analog signal to a
digital signal using an analog-to-digital converter (Russell, Fig. 2, "Analog Front-end"; Yamamoto,
Fig. 2, "A/D 9").

Therefore, it would have been obvious for one skilled in the art to apply Yamamoto's
teaching to Russell such that a pre-recorded audio signal is loaded into the processing device 43 in
Fig. 3 in reduced time, allowing the speech processing modules (Fig. 4, "voice print info",
"speech extraction", "codec") sufficient time to process the speech and produce output
without introducing a noticeable delay to the user. The combination yields the highly desirable
result of providing better user experience.

Application/Control Number: 10/031,471
Art Unit: 2442



Regarding claim 20, the combination of Russell and Yamamoto disclosed the method of 
claim 17. 

Russell further disclosed wherein producing the index table comprises processing the 
sequentially read data blocks (col. 14, lines 43-67 and col. 11, lines 1-17). 

Regarding claim 32, Russell disclosed a method for digitally recording an analog audio 
signal with automatic indexing, the method comprising: 

(a) receiving an analog audio signal containing audio information and signal (Fig. 2 and 
col. 9, line 30 disclosed "analog front end", which records analog signal; a speech signal by 
nature contains speech information (i.e., audio) and non-speech information (i.e. signal pauses), 
as evident in the disclosure in col. 7, lines 4-5); 

(b) converting the analog audio signal into digital audio data comprising audio 
information data and signal pause duration data specifying a length of a pause between audio 
events (Fig. 2 and column 9, lines 31-33; Russell might not have explicitly disclosed that the
signal pause duration data specifies a length of a pause between audio events, but it would have
been within the knowledge of one of ordinary skill in the art that said length of a pause between
audio events could have been easily derived from the information in the audio information data
block; the difference between Russell and the instant claim element, if any, is a matter of
implementation choice);
(c) storing the converted digital audio data (col. 6, lines 44-45, "stores the speech stream
in at least a temporary storage"); 

(d) reading the stored digital audio data sequentially (col. 14, lines 48-49 disclosed that 
the speech process program 136 allocates buffers to receive the real-time speech and separate the 
speech into phrases, which implies that the speech process program 136 must read the real-time 
speech sequentially); 

(e) decoding whether the digital audio data are audio information data or signal pause 
duration data (col. 15, lines 46-53 and col. 17, lines 63-65); 

(f) storing the audio information data as information data blocks and the signal 
pause duration data as signal pause data blocks in a memory (column 6, lines 43-53 disclosed 
storing the speech stream and structure which represents categorized portions of the speech 
stream; col. 15 lines 8-11 further disclosed that the characterization of speech includes the phrase 
time duration and the presence of pauses), 

wherein each information data block contains an information data block identifier and 
audio information data, and signal pause data block contains a signal pause data block identifier 
and signal pause duration data (Russell, col. 18, lines 2-13, "a data element" and "Phrase ID" or 
col. 18, lines 64-66, "phrase descriptor structure" and "phrase ID"); and 

(g) reading the stored data blocks sequentially in order to produce a data structure for 
managing the indexing (col. 14, lines 43-50 disclosed that a speech process program 136 separates
the speech into phrases demarcated by perceptible pauses; col. 17, line 67 disclosed "RIFF 
chunk" as an audio information data sequence), 
wherein a succession of information data blocks which is not interrupted by a signal
pause with a pre-determined duration being detected as an audio information data sequence
whose start and end are stored in the data structure for managing the indexing (col. 17, lines 63-64
disclosed that sound of at least a certain first threshold duration followed by silence of a
specified second duration indicates that a phrase has been completed; and the time the phrases 
began and ended are recorded). 

(h) producing an index table by sequentially reading the information data blocks and the 
signal pause data blocks (Russell, Fig. 5 and col. 14, lines 43-67 and col. 11, lines 1-17, "tag 
tables" or Fig. 18 and col. 18, lines 53-61, "phrase descriptors"). 

Russell did not explicitly disclose receiving the analog audio signal played at an 
increased speed. 

However, Yamamoto disclosed a method for fast reproduction of music or language
tapes, where it is known in the prior art that, to rapidly mass-reproduce music or language tapes,
the source tape can be driven at a high speed (such as 32 times the normal speed) (see Yamamoto,
col. 1, lines 19-22 and lines 38-45).

One of ordinary skill in the art would have been motivated to combine Russell and
Yamamoto because both disclosed a system and method for converting an analog signal to a
digital signal using an analog-to-digital converter (Russell, Fig. 2, "Analog Front-end"; Yamamoto,
Fig. 2, "A/D 9").

Therefore, it would have been obvious for one skilled in the art to apply Yamamoto's
teaching to Russell such that a pre-recorded audio signal is loaded into the processing device 43 in
Fig. 3 in reduced time, allowing the speech processing modules (Fig. 4, "voice print info",
"speech extraction", "codec") sufficient time to process the speech and produce output
without introducing a noticeable delay to the user. The combination yields the highly desirable
result of providing better user experience.

Claim 47 is rejected under the same rationale as claim 32 as it lists elements that are all 
listed in claim 32 and disclosed by Russell. 

Claim 50 is rejected under the same rationale as claim 32 as it lists elements that are all 
listed in claim 32 and disclosed by Russell. 

Regarding claims 26 and 41, the combination of Russell and Yamamoto disclosed the 
method of claims 17 and 32. 

Russell further disclosed wherein the digital audio data are compressed before storage 
(col. 9, lines 39-43, "codec" and col. 14, line 66, "to compress the speech").

Regarding claim 33, the combination of Russell and Yamamoto disclosed the method of 
claim 32. 

Russell further disclosed wherein the data structure produced for managing the indexing 
is an index table (Fig. 5 and col. 11, lines 1-27). 
4. Claims 19 and 34 are rejected under 35 U.S.C. 103(a) as obvious over Russell and
Yamamoto, in view of Welch et al. (U.S. Patent No. 4,336,421, hereinafter "Welch"). 

Regarding claims 19 and 34, the combination of Russell and Yamamoto disclosed the 
method of claims 17 and 33. 

Russell further disclosed that the start and end of an audio information data sequence are
stored as start and end addresses (Russell, Fig. 5 and column 18).

Russell did not explicitly disclose that the start address and end address are for a first
address pointer and a second address pointer of the index table. The examiner interprets the first
address pointer and the second address pointer as pointers to the start and end of the audio
information data in the memory, as it is unclear to the examiner what a first address pointer and a
second address pointer of the index table entail.

However, in the same field of endeavor, Welch disclosed storing the start and end of a 
speech segment as the start and end addresses in the memory (Welch, col. 13, lines 42-64). 

One of ordinary skill in the art would have been motivated to combine Russell and Welch 
because both disclosed detecting voice sounds and pauses in speech (Russell, "Summary of the 
Invention" and Welch, "Summary of the Invention"), and Welch supplemented Russell's 
teaching with implementation details relating to buffer management for the speech data. 

Therefore, it would have been obvious for one to combine Russell and Welch such that 
Russell's invention can be embodied using techniques taught by Welch. 
5. Claims 21-23, 30, 36-38, 45, 48 and 51 are rejected under 35 U.S.C. 103(a) as obvious
over Russell and Yamamoto, in view of Freudberg et al. (U.S. Patent No. 4,696,031, hereinafter
"Freudberg").

Regarding claims 21, 36, 48 and 51, the combination of Russell and Yamamoto 
disclosed the method of claims 20, 33, 47 and 50. 

Russell further disclosed filtering out a succession of information data blocks (i.e. short 
spoken utterances) between two adjacent signal pause data blocks that are not useful for the user 
(Russell, col. 16, lines 21-46). 

Russell did not explicitly disclose performing the filtering when the number of
information data blocks does not exceed a particular minimum value and the signal pause of the
two adjacent signal pause data blocks exceeds a particular first time limit value.

However, Freudberg disclosed filtering out short bursts of energy by combining an ON
time of less than 200 msec with the OFF times of the two adjacent OFF intervals to form a single
OFF interval, which is essentially the same as what is described in the claim (column 6, lines 32-49).
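
The burst-filtering step described above can be illustrated with a short sketch. It is offered only as one possible reading of the cited passage, not as Freudberg's implementation: the function name `merge_short_bursts`, the `("ON"/"OFF", milliseconds)` tuple representation, and the `min_on_ms` parameter are assumptions introduced here, with the 200 msec default taken from the passage as characterized in this action.

```python
from typing import List, Tuple

Interval = Tuple[str, float]  # ("ON" or "OFF", duration in milliseconds)


def merge_short_bursts(intervals: List[Interval],
                       min_on_ms: float = 200.0) -> List[Interval]:
    """Fold every ON interval shorter than min_on_ms into the surrounding OFF time."""
    merged: List[Interval] = []
    for state, dur in intervals:
        if state == "ON" and dur < min_on_ms:
            state = "OFF"  # treat the short burst as silence
        if merged and merged[-1][0] == state:
            # Adjacent intervals of the same state collapse into one interval.
            merged[-1] = (state, merged[-1][1] + dur)
        else:
            merged.append((state, dur))
    return merged
```

In this sketch a 100 msec burst lying between a 300 msec and a 400 msec OFF interval yields a single 800 msec OFF interval, so the burst no longer splits the pause.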

One would have been motivated to combine Russell and Freudberg because both 
disclosed silence (i.e. pause) detection and speech signal segmentation (Russell, col. 17, lines 60- 
65; Freudberg, col. 3, lines 28-40, "signal-ON" and "signal-OFF"). 

Therefore, it would have been obvious for one to incorporate Freudberg's method of
filtering out short energy bursts using threshold values into Russell to filter out information blocks
that are falsely detected as speech blocks so as to save system processing time and reduce error
rate.
Regarding claims 22 and 37, the combination of Russell, Yamamoto and Freudberg
disclosed the method of claims 21 and 36. 

Russell did not explicitly disclose wherein the minimum value is 1. 

Freudberg disclosed the minimum value for ON time is 200 msec (Freudberg, col. 6, 
lines 32-49). 

Examiner considers the difference in the minimum value as an implementation choice 
that may vary in different embodiments of the same inventive idea. 

The rationale for the motivation to combine Russell and Freudberg is the same as that 
provided in the rejection of claim 21. 

Regarding claims 23 and 38, the combination of Russell, Yamamoto and Freudberg 
disclosed the method of claims 21 and 36. 

Russell did not explicitly disclose wherein the first time limit value is 0.5 seconds. 

Freudberg disclosed using an OFF interval (Freudberg, col. 6, lines 50-55). 

Examiner considers the value of 0.5 seconds specified in the claim as an implementation 
choice of Freudberg's OFF interval, which may vary in different embodiments of the same
inventive idea. 

The rationale for the motivation to combine Russell and Freudberg is the same as that 
provided in the rejection of claim 21. 
Regarding claims 30 and 45, the combination of Russell and Yamamoto disclosed the
method of claims 17 and 32. 

Russell did not explicitly disclose wherein a succession of information data blocks which 
is not separated by a signal pause data block whose signal pause duration data amount to a signal 
pause of more than 2 seconds is detected as an audio information data sequence. 

However, Freudberg disclosed a signal pause detection method using an ON time to 
determine the minimum duration of speech intervals (Freudberg, col. 6, lines 32-49). 

Examiner considers the time duration of 2 seconds recited in the present claim as an
implementation choice of Freudberg's ON time, which may vary in different embodiments of the
same inventive concept.

6. Claims 24-25, 31, 39-40, 46, 49 and 52 are rejected under 35 U.S.C. 103(a) as obvious 
over Russell and Yamamoto, in view of Imai et al. (U.S. 2001/0010037, hereinafter "Imai"). 

Regarding claims 24, 39, 49 and 52, the combination of Russell and Yamamoto 
disclosed the method of claims 20, 33, 48 and the apparatus of claim 50. 

Russell did not explicitly disclose while processing the data, overwriting signal duration 
data of signal pause data blocks whose signal pause duration exceeds a particular second time 
limit value with signal duration data having particular nominal signal duration. 

However, Imai (2001/0010037) disclosed a speech rate conversion system that replaces a
non-speech interval exceeding a constant continued time with a break of the constant continued
time that is shorter than the actual non-speech interval (Imai, [0034]).
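
As a rough illustration of the overwriting step at issue, the following sketch caps long pauses at a nominal length. It follows the claim language, which uses two separate values, rather than Imai's text; the function name `cap_long_pauses`, the `("I"/"P", seconds)` tuple representation, and the parameter names are hypothetical, and the defaults of 10 seconds and 2 seconds are simply the values recited in claims 25 and 40.

```python
from typing import List, Tuple


def cap_long_pauses(blocks: List[Tuple[str, float]],
                    limit_s: float = 10.0,
                    nominal_s: float = 2.0) -> List[Tuple[str, float]]:
    """Overwrite the duration of any pause block longer than limit_s with nominal_s.

    Blocks are (kind, duration in seconds) pairs, where kind "P" marks a pause
    block; every other block is passed through unchanged.
    """
    return [
        (kind, nominal_s if kind == "P" and dur > limit_s else dur)
        for kind, dur in blocks
    ]
```

Calling the function with `nominal_s` equal to `limit_s` gives the single-threshold behavior that this action attributes to Imai, in which an over-long interval is replaced by a break of the threshold length.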
One would have been motivated to combine Russell and Imai because both disclosed
non-speech interval detection.

Therefore, it would have been obvious for one to add Imai's speech rate conversion to 
Russell to save storage space and adjust playback speed without losing audio information.

Regarding claims 25 and 40, the combination of Russell, Yamamoto and Imai disclosed 
the method of claims 24 and 39. 

Imai further disclosed wherein the second time limit value is 10 seconds and the nominal 
signal duration is 2 seconds (Examiner considers the threshold values recited in the claim as an 
implementation choice of Imai's constant continued time disclosed in [0034] which may vary in 
different embodiments of the same inventive idea). 

The rationale for the motivation to combine Russell and Imai is the same as that provided 
in the rejection of claim 24. 

Regarding claims 31 and 46, the combination of Russell and Yamamoto disclosed the 
method of claims 17 and 32. 

Russell did not explicitly disclose wherein, when receiving the analog audio signal, a 
playing speed of a data medium on which the analog audio signal is recorded can be set. 

However, Imai disclosed a method for converting speech rate at a preset scaling factor 
using speech interval detection (Imai, "Abstract"). 

The rationale for the motivation to combine Russell and Imai is the same as that provided 
in the rejection of claim 24. 
7. Claims 28-29 and 43-44 are rejected under 35 U.S.C. 103(a) as obvious over Russell and
Yamamoto, in view of Gan et al. (IEEE Publication "Implementation of Silence Compression 
Scheme For G.723.1 Speech Coder Using TI TMS320S75 DSP Chip", 1997, hereinafter "Gan"). 

Regarding claims 28 and 43, the combination of Russell and Yamamoto disclosed the 
method of claims 17 and 32. 

Russell did not explicitly disclose wherein all the data blocks are of a same size and 
correspond to a particular basic unit of duration. 

However, Gan disclosed an implementation of silence compression for the G.723.1 coder,
wherein the G.723.1 coder has a frame size of 30 ms and a compressed frame size of 24 bytes at
the 6.3 kb/s coding rate.
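
The relationship between the 30 ms frame duration and the 24-byte frame size can be checked with simple arithmetic. The helper below is only a sketch; the name `frame_bytes` and the rounding up to whole bytes are assumptions made here, not details taken from Gan.

```python
import math


def frame_bytes(bit_rate_bps: int, frame_ms: int) -> int:
    """Whole bytes needed to hold one coded frame at the given bit rate."""
    bits = bit_rate_bps * frame_ms / 1000  # bits produced per frame
    return math.ceil(bits / 8)
```

At 6300 bit/s a 30 ms frame carries 189 bits, which occupy 24 bytes once rounded up to a whole byte, so every block of that size corresponds to the same fixed unit of duration.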

One would have been motivated to combine Russell and Gan because both disclosed
silence detection and a codec.

It would have been obvious for one to incorporate Gan's silence compression and 
G.723.1 into Russell as one of the possible embodiments of Russell's speech peripheral. 

Regarding claims 29 and 44, the combination of Russell, Yamamoto and Gan disclosed 
the method of claims 28 and 43. 

Russell did not explicitly disclose wherein the basic unit of duration is 30 ms. 

However, as already addressed above in the rejection of claim 28, Gan disclosed a silence
compression implementation for the G.723.1 codec, where the G.723.1 codec has a basic frame
duration of 30 ms.
The rationale for the motivation to combine Russell and Gan is the same as that provided
in the rejection of claim 28. 



Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to SHIRLEY X. ZHANG whose telephone number is (571) 270-5012.
The examiner can normally be reached on Monday through Friday 7:30am - 5:00pm EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Glen Burgess, can be reached at (571) 272-3949. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



/Shirley X Zhang/ 
Examiner, Art Unit 2442 
2/25/2011 



